
    Adaptive Importance Sampling in General Mixture Classes

    In this paper, we propose an adaptive algorithm that iteratively updates both the weights and the component parameters of a mixture importance sampling density so as to optimise importance sampling performance, as measured by an entropy criterion. The method is shown to be applicable to a wide class of importance sampling densities, which in particular includes mixtures of multivariate Student t distributions. The performance of the proposed scheme is studied on both artificial and real examples, highlighting in particular the benefit of a novel Rao-Blackwellisation device that can be easily incorporated into the updating scheme.
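
    Below is a minimal sketch of an adaptive mixture importance sampler in the spirit of this abstract: mixture weights and component parameters are re-estimated from weighted samples at each iteration, using Rao-Blackwellised component responsibilities rather than the sampled component labels. The toy Gaussian target, the number of components, and the EM-style update are illustrative assumptions, not the authors' exact scheme.

```python
# A sketch of adaptive mixture importance sampling (illustrative toy setup,
# not the authors' exact scheme).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def log_target(x):
    # Toy unnormalised target: a correlated 2-D Gaussian (assumption).
    cov = np.array([[1.0, 0.8], [0.8, 1.0]])
    return stats.multivariate_normal(mean=[1.0, -1.0], cov=cov).logpdf(x)

K, d, n = 3, 2, 2000                        # components, dimension, samples/iter
mix_w = np.full(K, 1.0 / K)                 # mixture weights
means = rng.normal(size=(K, d)) * 3.0       # component means
covs = np.stack([np.eye(d)] * K)            # component covariances

for iteration in range(10):
    # 1. Draw from the current mixture proposal.
    comp = rng.choice(K, size=n, p=mix_w)
    x = np.array([rng.multivariate_normal(means[k], covs[k]) for k in comp])

    # 2. Importance weights w = pi(x) / q(x), with q the full mixture density.
    comp_logpdf = np.array([np.log(mix_w[k]) +
                            stats.multivariate_normal(means[k], covs[k]).logpdf(x)
                            for k in range(K)])           # shape (K, n)
    log_w = log_target(x) - np.logaddexp.reduce(comp_logpdf, axis=0)
    w = np.exp(log_w - log_w.max()); w /= w.sum()

    # 3. Rao-Blackwellised responsibilities: posterior component probabilities
    #    given x, used instead of the sampled component labels.
    resp = np.exp(comp_logpdf - comp_logpdf.max(axis=0)).T   # shape (n, K)
    resp /= resp.sum(axis=1, keepdims=True)

    # 4. Weighted, EM-style update of mixture weights and component parameters.
    for k in range(K):
        rk = w * resp[:, k]
        mix_w[k] = rk.sum()
        means[k] = (rk[:, None] * x).sum(0) / rk.sum()
        diff = x - means[k]
        covs[k] = ((rk[:, None, None] * np.einsum('ni,nj->nij', diff, diff)).sum(0)
                   / rk.sum()) + 1e-6 * np.eye(d)
    mix_w /= mix_w.sum()

print("adapted mixture weights:", np.round(mix_w, 3))
print("adapted component means:\n", np.round(means, 2))
```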

    Scalable Bayesian Non-Negative Tensor Factorization for Massive Count Data

    We present a Bayesian non-negative tensor factorization model for count-valued tensor data, and develop scalable inference algorithms (both batch and online) for dealing with massive tensors. Our generative model can handle overdispersed counts as well as infer the rank of the decomposition. Moreover, leveraging a reparameterization of the Poisson distribution as a multinomial facilitates conjugacy in the model and enables simple and efficient Gibbs sampling and variational Bayes (VB) inference updates, with a computational cost that depends only on the number of nonzeros in the tensor. The model also yields interpretable factors: each factor corresponds to a "topic". We develop a set of online inference algorithms that allow further scaling to massive tensors, for which batch inference methods may be infeasible. We apply our framework to diverse real-world applications, such as multiway topic modeling on a scientific publications database, analysis of a political science data set, and analysis of a massive household transactions data set.
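
    A minimal sketch of the Poisson-multinomial reparameterization mentioned in the abstract, for a 3-way count tensor: each nonzero count is split across the R latent components with a multinomial draw, after which gamma-Poisson conjugacy gives simple Gibbs updates whose cost depends only on the nonzeros. Tensor sizes, priors, and variable names below are illustrative assumptions.

```python
# A sketch of the Poisson-multinomial augmentation for a 3-way count tensor
# (illustrative sizes and priors; not the authors' exact model).
import numpy as np

rng = np.random.default_rng(1)
I, J, K, R = 30, 20, 10, 5                       # tensor dimensions, rank
a0, b0 = 0.5, 1.0                                # Gamma(shape, rate) prior

# Non-negative factor matrices and a synthetic sparse count tensor.
U, V, W = (rng.gamma(a0, 1.0 / b0, size=s) for s in [(I, R), (J, R), (K, R)])
Y = rng.poisson(np.einsum('ir,jr,kr->ijk', U, V, W))
nonzeros = np.argwhere(Y > 0)                    # inference touches nonzeros only

def gibbs_sweep(U, V, W):
    # 1. Multinomial allocation: split each nonzero count over the R components.
    S_u, S_v, S_w = np.zeros((I, R)), np.zeros((J, R)), np.zeros((K, R))
    for i, j, k in nonzeros:
        p = U[i] * V[j] * W[k]
        counts = rng.multinomial(Y[i, j, k], p / p.sum())
        S_u[i] += counts; S_v[j] += counts; S_w[k] += counts
    # 2. Conjugate gamma update for U (updates for V and W are symmetric).
    rate_u = b0 + (V.sum(axis=0) * W.sum(axis=0))[None, :]
    U = rng.gamma(a0 + S_u, 1.0 / rate_u)        # numpy gamma takes shape, scale
    return U, V, W

U, V, W = gibbs_sweep(U, V, W)
print("updated factor U, first row:", np.round(U[0], 3))
```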

    Practical Open-Loop Optimistic Planning

    We consider the problem of online planning in a Markov Decision Process when given only access to a generative model, restricted to open-loop policies - i.e. sequences of actions - and under a budget constraint. In this setting, the Open-Loop Optimistic Planning (OLOP) algorithm enjoys good theoretical guarantees but is overly conservative in practice, as we show in numerical experiments. We propose a modified version of the algorithm with tighter upper-confidence bounds, KL-OLOP, that leads to better practical performance while retaining the sample complexity bound. Finally, we propose an efficient implementation that significantly improves the time complexity of both algorithms.
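
    The following is a simplified sketch of open-loop planning with upper-confidence bounds: the planner searches over fixed action sequences using only a generative model, keeps running reward statistics per action prefix, and follows the optimistic sequence at each episode. The toy MDP, horizon, and Hoeffding-style bonus are illustrative assumptions; the paper's OLOP and KL-OLOP bounds are tighter and come with sample complexity guarantees.

```python
# A simplified open-loop optimistic planner (illustrative; not the exact
# OLOP / KL-OLOP bounds from the paper).
import math
import random
from collections import defaultdict

random.seed(0)
ACTIONS, HORIZON, GAMMA, BUDGET = 2, 3, 0.9, 200

def generative_model(state, action):
    # Hypothetical noisy chain MDP, standing in for the generative model.
    next_state = max(0, min(10, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == 10 else 0.1 * random.random()
    return next_state, reward

counts = defaultdict(int)       # number of times each action prefix was played
sums = defaultdict(float)       # cumulated discounted reward of each prefix

def ucb(prefix, episode):
    # Hoeffding-style optimistic estimate for the prefix (an assumption; the
    # paper uses tighter Kullback-Leibler upper-confidence bounds).
    if counts[prefix] == 0:
        return float('inf')
    mean = sums[prefix] / counts[prefix]
    return mean + math.sqrt(2.0 * math.log(episode + 1) / counts[prefix])

for episode in range(BUDGET):
    state, prefix = 0, ()
    for depth in range(HORIZON):
        # Open-loop choice: extend the prefix with the most optimistic action.
        action = max(range(ACTIONS), key=lambda a: ucb(prefix + (a,), episode))
        state, reward = generative_model(state, action)
        prefix += (action,)
        counts[prefix] += 1
        sums[prefix] += GAMMA ** depth * reward

# Recommend the first action of the most played depth-1 prefix.
print("recommended first action:",
      max(range(ACTIONS), key=lambda a: counts[(a,)]))
```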

    A Novel Document Generation Process for Topic Detection based on Hierarchical Latent Tree Models

    We propose a novel document generation process based on hierarchical latent tree models (HLTMs) learned from data. An HLTM has a layer of observed word variables at the bottom and multiple layers of latent variables on top. For each document, we first sample values for the latent variables layer by layer via logic sampling, then draw relative frequencies for the words conditioned on the values of the latent variables, and finally generate words for the document using these relative word frequencies. The motivation for the work is to take word counts into consideration with HLTMs. In comparison with LDA-based hierarchical document generation processes, the new process achieves drastically better model fit with far fewer parameters. It also yields more meaningful topics and topic hierarchies, and sets a new state of the art for hierarchical topic detection.
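
    A minimal sketch of the described generation process on a toy two-layer latent tree: latent variables are sampled top-down by logic sampling, the latent configuration determines a Dirichlet from which relative word frequencies are drawn, and the document's words are generated from those frequencies. The tree structure, conditional probability tables, and vocabulary are illustrative assumptions.

```python
# A sketch of the document generation process on a toy two-layer latent tree
# (tree structure, probabilities, and vocabulary are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(2)
vocab = ["markov", "chain", "tensor", "factor", "topic", "tree"]

# Top latent Z1 and its two child latents Z2a, Z2b (all binary).
p_z1 = np.array([0.5, 0.5])
p_z2_given_z1 = np.array([[0.9, 0.1],        # P(Z2 = . | Z1 = 0)
                          [0.2, 0.8]])       # P(Z2 = . | Z1 = 1)
# Dirichlet concentrations over the vocabulary, indexed by (Z2a, Z2b).
conc = rng.gamma(2.0, 1.0, size=(2, 2, len(vocab)))

def generate_document(n_words=20):
    # 1. Logic sampling of the latent layers, top-down.
    z1 = rng.choice(2, p=p_z1)
    z2a = rng.choice(2, p=p_z2_given_z1[z1])
    z2b = rng.choice(2, p=p_z2_given_z1[z1])
    # 2. Draw relative word frequencies conditioned on the latent values.
    freqs = rng.dirichlet(conc[z2a, z2b])
    # 3. Generate the document's word counts from those frequencies.
    counts = rng.multinomial(n_words, freqs)
    return {word: int(c) for word, c in zip(vocab, counts) if c > 0}

print(generate_document())
```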

    Kernel Sequential Monte Carlo

    We propose kernel sequential Monte Carlo (KSMC), a framework for sampling from static target densities. KSMC is a family of sequential Monte Carlo algorithms that are based on building emulator models of the current particle system in a reproducing kernel Hilbert space. We here focus on modelling nonlinear covariance structure and gradients of the target. The emulator's geometry is adaptively updated and subsequently used to inform local proposals. Unlike in adaptive Markov chain Monte Carlo, continuous adaptation does not compromise convergence of the sampler. KSMC combines the strengths of sequential Monte Carlo and kernel methods: superior performance for multimodal targets and the ability to estimate model evidence as compared to Markov chain Monte Carlo, and the emulator's ability to represent targets that exhibit high degrees of nonlinearity. As KSMC does not require access to target gradients, it is particularly applicable to targets whose gradients are unknown or prohibitively expensive. We describe necessary tuning details and demonstrate the benefits of the proposed methodology on a series of challenging synthetic and real-world examples.
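
    Below is a minimal sequential-Monte-Carlo-sampler skeleton for a static target along the lines sketched in the abstract: tempered intermediate targets, importance reweighting, resampling, and an MCMC move informed by the current particle system. For brevity the move step here adapts a plain weighted covariance; the paper instead fits a kernel (RKHS) emulator to capture nonlinear covariance structure. The target and tempering schedule are assumptions.

```python
# An SMC-sampler skeleton for a static target (illustrative; the particle-based
# proposal adaptation below replaces the paper's kernel/RKHS emulator).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
N, d = 500, 2

def log_target(x):
    # Toy bimodal target in 2-D (an assumption).
    return np.logaddexp(stats.multivariate_normal([-3.0, 0.0], np.eye(d)).logpdf(x),
                        stats.multivariate_normal([3.0, 0.0], np.eye(d)).logpdf(x))

def log_prior(x):
    return stats.multivariate_normal(np.zeros(d), 25.0 * np.eye(d)).logpdf(x)

betas = np.linspace(0.0, 1.0, 21)               # tempering schedule
x = rng.normal(scale=5.0, size=(N, d))          # particles drawn from the prior
logw = np.zeros(N)

for b_prev, b in zip(betas[:-1], betas[1:]):
    # 1. Reweight towards pi_b proportional to prior^(1-b) * target^b.
    logw += (b - b_prev) * (log_target(x) - log_prior(x))
    w = np.exp(logw - logw.max()); w /= w.sum()

    # 2. Resample when the effective sample size degenerates.
    if 1.0 / np.sum(w ** 2) < N / 2:
        x = x[rng.choice(N, size=N, p=w)]
        logw = np.zeros(N); w = np.full(N, 1.0 / N)

    # 3. Move step: random-walk Metropolis whose proposal covariance is adapted
    #    from the weighted particles (the kernel emulator would inform this).
    cov = np.cov(x.T, aweights=w) + 1e-6 * np.eye(d)
    prop = x + rng.multivariate_normal(np.zeros(d), cov, size=N)
    log_pi = lambda y: (1.0 - b) * log_prior(y) + b * log_target(y)
    accept = np.log(rng.random(N)) < log_pi(prop) - log_pi(x)
    x[accept] = prop[accept]

print("final particle mean:", np.round(x.mean(axis=0), 2))
```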

    A population Monte Carlo scheme with transformed weights and its application to stochastic kinetic models

    This paper addresses the problem of Monte Carlo approximation of posterior probability distributions. In particular, we consider a recently proposed technique known as population Monte Carlo (PMC), which is based on an iterative importance sampling approach. An important drawback of this methodology is the degeneracy of the importance weights when the dimension of either the observations or the variables of interest is high. To alleviate this difficulty, we propose a novel method that performs a nonlinear transformation of the importance weights. This operation reduces the weight variation, hence avoiding degeneracy and increasing the efficiency of the importance sampling scheme, especially when drawing from proposal functions that are poorly adapted to the true posterior. For the sake of illustration, we apply the proposed algorithm to the estimation of the parameters of a Gaussian mixture model. This is a very simple problem that enables us to clearly show and discuss the main features of the proposed technique. As a practical application, we also consider the popular (and challenging) problem of estimating the rate parameters of stochastic kinetic models (SKMs). SKMs are highly multivariate systems that model molecular interactions in biological and chemical problems. We introduce a particularization of the proposed algorithm for SKMs and present numerical results.
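
    A minimal sketch of weight transformation inside a population Monte Carlo iteration: raw importance weights are passed through a nonlinear transformation (here, clipping the largest log-weights) before resampling and proposal adaptation, which tames weight degeneracy. The target, proposal, and clipping rule are illustrative choices rather than the paper's exact specification.

```python
# Importance-weight clipping inside a PMC-style iteration (illustrative target,
# proposal, and clipping rule; not the paper's exact specification).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
N, d, M = 1000, 10, 50                        # samples, dimension, clip level

def log_target(x):
    # Toy 10-D standard normal target, poorly matched by the initial proposal.
    return stats.multivariate_normal(np.zeros(d), np.eye(d)).logpdf(x)

mu, sigma = np.full(d, 2.0), 3.0              # deliberately mismatched proposal

for iteration in range(5):
    x = mu + sigma * rng.normal(size=(N, d))
    log_q = stats.multivariate_normal(mu, sigma ** 2 * np.eye(d)).logpdf(x)
    log_w = log_target(x) - log_q

    # Nonlinear transformation: clip the M largest log-weights at the value of
    # the M-th largest one, flattening the spikes that cause degeneracy.
    threshold = np.partition(log_w, -M)[-M]
    log_w_clipped = np.minimum(log_w, threshold)
    w = np.exp(log_w_clipped - log_w_clipped.max()); w /= w.sum()

    # Adapt the proposal from the resampled population.
    resampled = x[rng.choice(N, size=N, p=w)]
    mu, sigma = resampled.mean(axis=0), resampled.std()

print("adapted proposal mean (first 3 coordinates):", np.round(mu[:3], 2))
```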

    Recovering the state sequence of hidden Markov models using mean-field approximations

    Inferring the sequence of states from observations is one of the most fundamental problems in hidden Markov models. In statistical physics language, this problem is equivalent to computing the marginals of a one-dimensional model with a random external field. While this task can be accomplished through transfer matrix methods, it quickly becomes intractable when the underlying state space is large. This paper develops several low-complexity approximate algorithms to address this inference problem when the state space becomes large. The new algorithms are based on various mean-field approximations of the transfer matrix. Their performance is studied in detail on a simple realistic model for DNA pyrosequencing.
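
    A minimal sketch of the exact transfer-matrix computation of posterior state marginals, which is the quantity the paper approximates: each position contributes one transfer matrix, and left/right partial products give the marginals. When the state space is large, the O(S^2) matrix-vector products below are what the mean-field approximations replace. The toy model parameters are illustrative assumptions.

```python
# Exact transfer-matrix computation of HMM state marginals (the quantity the
# paper approximates; toy parameters are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(5)
S, T = 4, 50                                   # number of states, sequence length
A = rng.dirichlet(np.ones(S), size=S)          # transition matrix, rows sum to 1
B = rng.dirichlet(np.ones(S), size=S)          # emission matrix P(obs | state)
pi = np.full(S, 1.0 / S)
obs = rng.integers(0, S, size=T)               # an arbitrary observed sequence

# Transfer matrix at position t: M_t[i, j] = A[i, j] * P(obs_t | state j).
# Each left/right product below costs O(S^2); these products are what the
# mean-field approximations replace when S is large.
Ms = [A * B[:, o][None, :] for o in obs]

left = np.zeros((T, S)); right = np.ones((T, S))
left[0] = pi * B[:, obs[0]]; left[0] /= left[0].sum()
for t in range(1, T):
    left[t] = left[t - 1] @ Ms[t]; left[t] /= left[t].sum()
for t in range(T - 2, -1, -1):
    right[t] = Ms[t + 1] @ right[t + 1]; right[t] /= right[t].sum()

marginals = left * right
marginals /= marginals.sum(axis=1, keepdims=True)   # P(state_t | obs_1..obs_T)
print("marginal at t=0:", np.round(marginals[0], 3))
```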

    Localizing the Latent Structure Canonical Uncertainty: Entropy Profiles for Hidden Markov Models

    This report addresses state inference for hidden Markov models. These models rely on unobserved states, which often have a meaningful interpretation, making it necessary to develop diagnostic tools for the quantification of state uncertainty. The entropy of the state sequence that explains an observed sequence for a given hidden Markov chain model can be considered as the canonical measure of state sequence uncertainty. This canonical measure is not reflected by the classic multivariate state profiles computed by the smoothing algorithm, which summarize the possible state sequences. Here, we introduce a new type of profile with the following properties: (i) the profiles of conditional entropies constitute a decomposition of the canonical measure of state sequence uncertainty along the sequence, making it possible to localize this uncertainty; (ii) the profiles are univariate and thus remain easily interpretable on tree structures. We show how to extend the smoothing algorithms for hidden Markov chain and tree models to compute these entropy profiles efficiently.
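
    A minimal sketch of an entropy profile for a hidden Markov chain: because the state sequence is Markovian conditionally on the observations, its entropy decomposes as H(S_1 | X) + sum_t H(S_t | S_{t-1}, X), and each term is computable from standard smoothing quantities. The toy model below is an illustrative assumption; the paper's contribution is the efficient extension of smoothing to compute such profiles, including for tree models.

```python
# Entropy profile for a toy hidden Markov chain: the state-sequence entropy
# decomposes into per-position conditional entropies computable from smoothing
# quantities (toy parameters are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(6)
S, T = 3, 30
A = rng.dirichlet(np.ones(S), size=S)           # transition matrix
B = rng.dirichlet(np.ones(S), size=S)           # emission matrix P(obs | state)
pi = np.full(S, 1.0 / S)
obs = rng.integers(0, S, size=T)                # an arbitrary observed sequence

# Standard scaled forward-backward (smoothing) pass.
alpha, beta = np.zeros((T, S)), np.ones((T, S))
alpha[0] = pi * B[:, obs[0]]; alpha[0] /= alpha[0].sum()
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]; alpha[t] /= alpha[t].sum()
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1]); beta[t] /= beta[t].sum()
gamma = alpha * beta; gamma /= gamma.sum(axis=1, keepdims=True)

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# Profile: h[0] = H(S_1 | X); h[t] = H(S_t | S_{t-1}, X) for t >= 1.
profile = np.zeros(T)
profile[0] = entropy(gamma[0])
for t in range(1, T):
    # P(S_t = j | S_{t-1} = i, X) is proportional to A[i, j] B[j, obs_t] beta[t, j].
    cond = A * (B[:, obs[t]] * beta[t])[None, :]
    cond /= cond.sum(axis=1, keepdims=True)
    row_entropies = np.array([entropy(cond[i]) for i in range(S)])
    profile[t] = np.sum(gamma[t - 1] * row_entropies)

print("entropy profile:", np.round(profile, 3))
print("total state-sequence entropy H(S | X):", round(profile.sum(), 3))
```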

    Ergodicity, Decisions, and Partial Information

    In the simplest sequential decision problem for an ergodic stochastic process X, at each time n a decision u_n is made as a function of past observations X_0,...,X_{n-1}, and a loss l(u_n,X_n) is incurred. In this setting, it is known that one may choose (under a mild integrability assumption) a decision strategy whose pathwise time-average loss is asymptotically smaller than that of any other strategy. The corresponding problem in the case of partial information proves to be much more delicate, however: if the process X is not observable, but decisions must be based on the observation of a different process Y, the existence of pathwise optimal strategies is not guaranteed. The aim of this paper is to exhibit connections between pathwise optimal strategies and notions from ergodic theory. The sequential decision problem is developed in the general setting of an ergodic dynamical system (\Omega,B,P,T) with partial information Y\subseteq B. The existence of pathwise optimal strategies is shown to hinge on two basic properties: the conditional ergodic theory of the dynamical system, and the complexity of the loss function. When the loss function is not too complex, a general sufficient condition for the existence of pathwise optimal strategies is that the dynamical system is a conditional K-automorphism relative to the past observations \bigvee_n T^n Y. If the conditional ergodicity assumption is strengthened, the complexity assumption can be weakened. Several examples demonstrate the interplay between complexity and ergodicity, which does not arise in the case of full information. Our results also yield a decision-theoretic characterization of weak mixing in ergodic theory, and establish pathwise optimality of ergodic nonlinear filters.
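
    A toy instance of the fully observed sequential decision problem described in the first sentences of the abstract, included only to make the setting concrete: decisions u_n depend on past observations of an ergodic Markov chain X, and the pathwise time-average loss of a predictive strategy is compared with that of a constant strategy. The chain, loss, and strategies are illustrative assumptions and do not touch the paper's partial-information setting.

```python
# A toy full-information sequential decision problem on an ergodic Markov
# chain (chain, loss, and strategies are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(8)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])                    # ergodic two-state transition matrix
T = 20_000

# Simulate the observed process X.
X = np.zeros(T, dtype=int)
for n in range(1, T):
    X[n] = rng.choice(2, p=P[X[n - 1]])

loss = lambda u, x: float(u != x)             # loss l(u_n, X_n): 0-1 loss

# Strategy 1: decide on the most likely next state given the last observation.
u_pred = np.array([np.argmax(P[X[n - 1]]) for n in range(1, T)])
# Strategy 2: a constant decision, for comparison.
u_const = np.zeros(T - 1, dtype=int)

time_avg = lambda u: np.mean([loss(u[n], X[n + 1]) for n in range(T - 1)])
print("predictive strategy time-average loss:", round(time_avg(u_pred), 3))
print("constant strategy time-average loss:  ", round(time_avg(u_const), 3))
```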

    Interacting Multiple Try Algorithms with Different Proposal Distributions

    We propose a new class of interacting Markov chain Monte Carlo (MCMC) algorithms designed to increase the efficiency of a modified multiple-try Metropolis (MTM) algorithm. The extension with respect to the existing MCMC literature is twofold. First, the proposed sampler extends the basic MTM algorithm by allowing different proposal distributions in the multiple-try generation step. Second, we exploit the structure of the MTM algorithm with different proposal distributions to naturally introduce an interacting MTM mechanism (IMTM) that expands the class of population Monte Carlo methods. We show the validity of the algorithm and discuss the choice of the selection weights and of the different proposals. We provide numerical studies which show that the new algorithm can perform better than the basic MTM algorithm and that the interaction mechanism allows the IMTM to efficiently explore the state space.
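
    A minimal sketch of one multiple-try Metropolis step with different proposal distributions: one candidate is drawn from each proposal, one candidate is selected with probability proportional to its selection weight, and reference points drawn around the selected candidate give the acceptance ratio. The Gaussian random-walk proposals, the importance-weight form of the selection weights, and the toy target are illustrative assumptions; the interaction across parallel chains is omitted.

```python
# One multiple-try Metropolis step with different proposal distributions
# (proposal family, selection-weight form, and target are illustrative
# assumptions; the interacting-chain mechanism is omitted).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
scales = np.array([0.1, 1.0, 5.0])            # K proposals with different scales

def log_target(x):
    # Toy bimodal 1-D target.
    return np.logaddexp(stats.norm(-4, 1).logpdf(x), stats.norm(4, 1).logpdf(x))

def log_q(j, y, x):
    # Proposal j: Gaussian random walk centred at the current point.
    return stats.norm(x, scales[j]).logpdf(y)

def mtm_step(x):
    K = len(scales)
    # Generation step: one candidate from each proposal.
    y = np.array([rng.normal(x, scales[j]) for j in range(K)])
    # Selection weights in importance-weight form: w_j = pi(y_j) / q_j(y_j | x).
    log_w = np.array([log_target(y[j]) - log_q(j, y[j], x) for j in range(K)])
    probs = np.exp(log_w - log_w.max()); probs /= probs.sum()
    J = rng.choice(K, p=probs)
    # Reference points: x*_j ~ q_j(. | y_J) for j != J, and x*_J = x.
    x_ref = np.array([x if j == J else rng.normal(y[J], scales[j]) for j in range(K)])
    log_w_ref = np.array([log_target(x_ref[j]) - log_q(j, x_ref[j], y[J])
                          for j in range(K)])
    # Accept with probability min(1, sum_j w_j(y_j, x) / sum_j w_j(x*_j, y_J)).
    log_ratio = np.logaddexp.reduce(log_w) - np.logaddexp.reduce(log_w_ref)
    return y[J] if np.log(rng.random()) < log_ratio else x

x, chain = 0.0, []
for _ in range(5000):
    x = mtm_step(x)
    chain.append(x)
print("estimated target mean:", round(float(np.mean(chain)), 3))
```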